S1-



S2- 



Recording an analog audio sequence s(t) representing the voice of a speaker S_i
interfered by statistically distributed background noise ñ(t), said audio sequence
including both environmental noise n'(t) and weighted surrounding
persons' interfering voices in the environment of said speaker S_i

Recording a digital video sequence v(nT) showing the face of said speaker S_i for
detecting and analyzing said speaker's lip movements and/or facial expressions

Subjecting said analog audio sequence s(t) to an analog-to-digital conversion



Calculating the corresponding discrete signal spectrum S(k·Δf) of the analog-to-digital-
converted audio sequence s(nT) by performing a Fast Fourier Transform (FFT)
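
As an illustration of the transform step, here is a minimal Python sketch that is not taken from the figure. It uses a naive DFT in place of an FFT (both produce the same values), and the 8 kHz sampling rate, the 1 kHz test tone and the helper names `dft` and `bin_frequencies` are assumptions chosen only for this example. It shows how bin k of the spectrum corresponds to the frequency k·Δf, with Δf = 1/(N·T).

```python
import cmath
import math


def dft(samples):
    """Naive O(N^2) discrete Fourier transform; an FFT computes the same values faster."""
    n = len(samples)
    return [
        sum(samples[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def bin_frequencies(n, sampling_period):
    """Frequency k * delta_f of each bin, with delta_f = 1 / (N * T)."""
    delta_f = 1.0 / (n * sampling_period)
    return [k * delta_f for k in range(n)]


# Hypothetical test signal (assumption): a 1 kHz tone sampled at 8 kHz.
T = 1.0 / 8000.0
N = 64
s = [math.sin(2 * math.pi * 1000.0 * m * T) for m in range(N)]

S = dft(s)
freqs = bin_frequencies(N, T)

# Look only at the lower half of the spectrum; the upper half mirrors it.
peak_bin = max(range(N // 2), key=lambda k: abs(S[k]))
peak_frequency = freqs[peak_bin]
```

With these assumed values Δf is 125 Hz, so the tone falls exactly on one bin and `peak_frequency` recovers the tone frequency.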




S3- 



Detecting the voice of said speaker S_i from said signal spectrum S(k·Δf) by
analyzing visual features o_v,t extracted from the video sequence v(nT) tracking said
lip movements and/or facial expressions






S4- 



Estimating the noise power density spectrum Φ_nn(f) of the statistically distributed
background noise ñ(t) based on the result of the speaker detection step (S3)



S5 



Subtracting a discretized version Φ̃_nn(k·Δf) of the estimated noise power density
spectrum Φ̃_nn(f) from the discrete signal spectrum S(k·Δf) of the analog-to-digital-
converted audio sequence s(nT)
S6-



Calculating the corresponding discrete time-domain signal ŝ_i(nT) of the obtained
difference signal by performing an Inverse Fast Fourier Transform (IFFT), thereby
yielding a discrete version of the recognized speech signal
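
The estimate, subtract and inverse-transform steps above can be sketched in Python as follows. This is one common form of spectral subtraction, offered under stated assumptions and not as the method of the figure: the noise power is averaged over frames flagged as speech-free (the flags stand in for the result of the speaker detection step), negative differences are clamped to zero, and the phase of the noisy spectrum is reused. All function names and the synthetic tones are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (stands in for an FFT)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n).real
        for m in range(n)
    ]


def estimate_noise_power(frames, speech_flags):
    """Average |S(k)|^2 over frames flagged speech-free (flags assumed to come from S3)."""
    noise_spectra = [dft(f) for f, speaking in zip(frames, speech_flags) if not speaking]
    if not noise_spectra:
        raise ValueError("need at least one speech-free frame")
    n = len(noise_spectra[0])
    return [sum(abs(spec[k]) ** 2 for spec in noise_spectra) / len(noise_spectra)
            for k in range(n)]


def spectral_subtract(frame, noise_power):
    """Subtract the noise power per bin, clamp at zero, keep the noisy phase."""
    cleaned = []
    for value, noise in zip(dft(frame), noise_power):
        power = max(abs(value) ** 2 - noise, 0.0)
        cleaned.append(cmath.rect(math.sqrt(power), cmath.phase(value)))
    return idft(cleaned)


# Synthetic, idealized data (assumption): the "noise" is a fixed tone in bin 3,
# the "speech" a tone in bin 5, so the two do not overlap in frequency.
N = 32
noise = [0.5 * math.sin(2 * math.pi * 3 * m / N) for m in range(N)]
speech = [math.sin(2 * math.pi * 5 * m / N) for m in range(N)]
noisy = [a + b for a, b in zip(speech, noise)]

noise_power = estimate_noise_power([noise, noisy], [False, True])
cleaned = spectral_subtract(noisy, noise_power)
max_error = max(abs(c - e) for c, e in zip(cleaned, speech))
```

In this idealized case the noise occupies bins the speech does not use, so the cleaned frame matches the speech tone closely. With real, random noise the subtraction only removes the average noise power and leaves residual artifacts.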






Fig. 3a 
S8o-



Band-pass-filtering the discrete signal spectrum S(k·Δf)



S8b- 



S8a- 



S9- 



S10- 






Performing an amplitude detection of the band-pass-filtered discrete
signal spectrum S(k·Δf)
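
Band-pass filtering and amplitude detection on a discrete spectrum can be sketched as below. This is an illustrative Python sketch under assumptions, not the circuit of the figure: the 300 Hz to 3400 Hz pass band, the threshold, the bin spacing of 500 Hz and the function names are all hypothetical, and the filter is modelled as an ideal mask that zeroes the bins outside the band.

```python
def band_pass(spectrum, delta_f, low_hz, high_hz):
    """Zero every bin whose frequency lies outside [low_hz, high_hz].

    Bin k of an N-point spectrum sits at k * delta_f; bins above N/2 mirror
    the lower half, so they are folded back with min(k, N - k).
    """
    n = len(spectrum)
    filtered = []
    for k, value in enumerate(spectrum):
        freq = min(k, n - k) * delta_f
        filtered.append(value if low_hz <= freq <= high_hz else 0j)
    return filtered


def amplitude_detect(filtered_spectrum, threshold):
    """Return True when the mean magnitude of the filtered spectrum exceeds the threshold."""
    mean_magnitude = sum(abs(v) for v in filtered_spectrum) / len(filtered_spectrum)
    return mean_magnitude > threshold


# Hypothetical 16-bin spectra with delta_f = 500 Hz (assumption).
DELTA_F = 500.0
quiet = [20 + 0j] + [0j] * 15          # only a strong DC / low-frequency component
talking = list(quiet)
talking[2] = 10 + 0j                   # 1 kHz component
talking[14] = 10 + 0j                  # its mirror image

quiet_active = amplitude_detect(band_pass(quiet, DELTA_F, 300.0, 3400.0), 0.5)
speech_active = amplitude_detect(band_pass(talking, DELTA_F, 300.0, 3400.0), 0.5)
```

The out-of-band component is removed before the level is measured, so only the frame with energy inside the assumed pass band is flagged as active.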



Correlating the discrete signal spectrum S_τ(k·Δf) of the delayed
version s(nT−τ) of the analog-to-digital-converted audio signal s(nT)
with an audio speech activity estimate obtained by said amplitude
detection step S8b, thereby yielding an estimate Ŝ_i(f) for the frequency
spectrum S_i(f) corresponding to the signal s_i(t) which represents said
speaker's voice as well as an estimate Φ̃_nn(f) for the noise power
density spectrum Φ_nn(f) of the statistically distributed background noise ñ(t)






Correcting the discrete signal spectrum S_τ(k·Δf) with a visual speech
activity estimate taken from a visual feature vector o_v,t supplied by
the visual feature extraction and analyzing means 104a+b and/or 104'
+ 104'', thereby yielding a further estimate Ŝ_i'(f) for updating the
estimate Ŝ_i(f) for the frequency spectrum S_i(f) as well as a further estimate
Φ̃_nn'(f) for updating the estimate Φ̃_nn(f)






Adjusting the cut-off frequencies of a band-pass filter 204 used for
filtering the discrete signal spectrum S(k·Δf) dependent on the
bandwidth of the estimated speech signal spectrum Ŝ_i(f)
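
One way to derive cut-off frequencies from the bandwidth of an estimated speech spectrum is sketched below in Python. The figure does not say how the bandwidth is measured, so the rule used here (keep the span of bins whose magnitude reaches a fraction of the peak) is an assumption, as are the function name `estimate_band_edges`, the `relative_level` parameter and the sample magnitudes.

```python
def estimate_band_edges(speech_spectrum, delta_f, relative_level=0.1):
    """Return (low_hz, high_hz) spanning the bins at or above relative_level * peak.

    Only bins 0..N/2 are inspected because the upper half of the spectrum of
    a real signal mirrors the lower half. Returns None for an all-zero spectrum.
    """
    half = len(speech_spectrum) // 2 + 1
    magnitudes = [abs(v) for v in speech_spectrum[:half]]
    peak = max(magnitudes)
    if peak == 0.0:
        return None
    active = [k for k, m in enumerate(magnitudes) if m >= relative_level * peak]
    return active[0] * delta_f, active[-1] * delta_f


# Hypothetical magnitudes for bins 0..8 of a 16-point spectrum, delta_f = 500 Hz.
lower_half = [0.1, 0.2, 5.0, 8.0, 6.0, 1.0, 0.3, 0.1, 0.0]
spectrum = lower_half + lower_half[-2:0:-1]   # append the mirrored bins 7..1

cutoffs = estimate_band_edges(spectrum, 500.0)
```

The returned pair could then be used as the lower and upper cut-off of the band-pass filter for the next frame, so that the pass band follows the estimated speech bandwidth.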






Fig. 3b 
S8o-



Band-pass-filtering the discrete signal spectrum S(k·Δf)



S8b- 



Performing an amplitude detection of the band-pass-filtered discrete
signal spectrum S(k·Δf)



S11a-



S11b-



S11c-



[...] speech activity estimate obtained by the amplitude [...]

104' + 104'' [...]




Correlating the discrete signal spectrum S(k·Δf) with the audio-visual
speech activity estimate, thereby yielding an estimate Ŝ_i(f) for the
frequency spectrum S_i(f) corresponding to the signal s_i(t) which represents
said speaker's voice as well as an estimate Φ̃_nn(f) for the noise power
density spectrum Φ_nn(f) of the statistically distributed background noise ñ(t)






Adjusting the cut-off frequencies of a band-pass filter 204 used for
filtering the discrete signal spectrum S(k·Δf) dependent on the
bandwidth of the estimated speech signal spectrum Ŝ_i(f)



